[SYCL Spec][Joint Matrix] Add a new overload for joint_matrix_apply to be able to return result into a different matrix #13153

Merged 6 commits on Sep 20, 2024

@dkhaldi dkhaldi commented Mar 25, 2024

Currently, CUDA code that uses this pattern:
  for (int i = 0; i < c_frag.num_elements; i++) {
    c_frag.x[i] = alpha * acc_frag.x[i] + beta * c_frag.x[i];
  }
cannot be migrated to SYCL joint matrix.
The overload added here addresses this limitation.
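
To make the intent concrete, here is a minimal host-side sketch in plain C++ of the element-wise pattern the description refers to. It is only a model of the assumed semantics (a callable invoked on corresponding elements of two matrices, writing into the second one). `Fragment`, `apply_pairwise`, and `scale_and_accumulate` are hypothetical names introduced for illustration; they are not the SYCL `joint_matrix` API or the signature of the new overload, which is defined by the spec text in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for a matrix fragment.
// This is NOT the SYCL joint_matrix type.
using Fragment = std::vector<float>;

// Hypothetical helper modelling the assumed semantics of a two-matrix apply:
// call func(src_element, dst_element) for each pair of corresponding
// elements, so func can write its result into the second matrix.
template <typename F>
void apply_pairwise(const Fragment &src, Fragment &dst, F &&func) {
  assert(src.size() == dst.size());
  for (std::size_t i = 0; i < src.size(); ++i)
    func(src[i], dst[i]);
}

// The CUDA loop from the description, expressed through the helper:
// c[i] = alpha * acc[i] + beta * c[i]
void scale_and_accumulate(const Fragment &acc, Fragment &c, float alpha,
                          float beta) {
  apply_pairwise(acc, c, [=](const float &a, float &out) {
    out = alpha * a + beta * out;
  });
}
```

The point of the sketch is that the callable sees one element from each matrix at the same position, which a single-matrix apply cannot express when the result has to land in a different matrix than the one being read.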

dkhaldi commented Mar 29, 2024

@gmlueck, can you please review this?

sarnex pushed a commit that referenced this pull request Mar 29, 2024
…able to return result into a different matrix (#13151)

Currently, CUDA code that uses this pattern:
  for (int i = 0; i < c_frag.num_elements; i++) {
    c_frag.x[i] = alpha * acc_frag.x[i] + beta * c_frag.x[i];
  }
cannot be migrated to SYCL joint matrix.
The overload added here addresses this.
The spec API is added in #13153.
@AlexeySachkov

@dkhaldi, a friendly reminder about this PR. Without it, we have undocumented public APIs in our implementation (#13151), which is never a good idea.

dkhaldi commented Sep 9, 2024

> @dkhaldi, a friendly reminder about this PR. Without it, we have undocumented public APIs in our implementation (#13151), which is never a good idea.

@AlexeySachkov, yes, it is on my TODO list. I will give it higher priority and work on it this week.

dkhaldi commented Sep 20, 2024

@intel/llvm-gatekeepers, please help merge this.

@steffenlarsen steffenlarsen merged commit 97dba84 into intel:sycl Sep 20, 2024
2 checks passed